Private Aggregation from Fewer Anonymous Messages
Consider the setup where parties are each given a number and the goal is to compute the sum in a secure
fashion and with as little communication as possible. We study this problem in
the anonymized model of Ishai et al. (FOCS 2006) where each party may broadcast
anonymous messages on an insecure channel.
We present a new analysis of the one-round "split and mix" protocol of Ishai
et al. In order to achieve the same security parameter, our analysis reduces
the required number of messages by a multiplicative factor. We
complement our positive result with lower bounds showing that the dependence of
the number of messages on the domain size, the number of parties, and the
security parameter is essentially tight.
Using a reduction of Balle et al. (2019), our improved analysis of the
protocol of Ishai et al. yields, in the same model, an (ε, δ)-differentially private
protocol for aggregation that, for any constant ε and any δ,
incurs only a constant error and requires only a constant number of messages
per party. Previously, such a protocol was known only for Ω(log n)
messages per party.
Comment: 31 pages; 1 table
Anonymisation of geographical distance matrices via Lipschitz embedding
BACKGROUND: Anonymisation of spatially referenced data has received increasing attention in recent years. Whereas the research focus has been on the anonymisation of point locations, the disclosure risk arising from the publishing of inter-point distances and corresponding anonymisation methods have not been studied systematically.
METHODS: We propose a new anonymisation method for the release of geographical distances between records of a microdata file (for example, patients in a medical database). We discuss a data release scheme in which microdata without coordinates and an additional distance matrix between the corresponding rows of the microdata set are released. In contrast to most other approaches, this method preserves small distances better than larger distances. The distances are modified by a variant of Lipschitz embedding.
RESULTS: The effects of the embedding parameters on the risk of data disclosure are evaluated by linkage experiments using simulated data. The results indicate small disclosure risks for appropriate embedding parameters.
CONCLUSION: The proposed method is useful if published distance information might be misused for the re-identification of records. The method can be used for publishing scientific-use files and as an additional tool for record-linkage studies.
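To make the embedding idea concrete, here is a minimal sketch of a classic Lipschitz embedding: each record is mapped to its minimum distance to each of several random reference sets, and only distances between the embedded vectors would be released. This is the textbook construction, not the paper's small-distance-preserving variant; all names and parameter choices are illustrative.

```python
import random

def lipschitz_embed(dist, n, k, set_size, seed=0):
    """Embed n records (known only through their pairwise distance
    function `dist`) into R^k via distances to random reference sets."""
    rng = random.Random(seed)
    ref_sets = [rng.sample(range(n), set_size) for _ in range(k)]
    return [
        [min(dist(i, r) for r in R) for R in ref_sets]  # one coordinate per set
        for i in range(n)
    ]

# toy example: records are points on a line, distance = absolute difference
points = [0.0, 1.0, 5.0, 6.0]
emb = lipschitz_embed(lambda i, j: abs(points[i] - points[j]),
                      n=len(points), k=8, set_size=2, seed=42)

def embedded_dist(u, v):
    # Chebyshev (max) distance never exceeds the original distance,
    # because each coordinate is 1-Lipschitz by the triangle inequality
    return max(abs(a - b) for a, b in zip(u, v))

for i in range(len(points)):
    for j in range(len(points)):
        assert embedded_dist(emb[i], emb[j]) <= abs(points[i] - points[j]) + 1e-9
```

Because the published coordinates are minima of distances rather than the distances themselves, an attacker linking against known locations works with distorted values, which is the source of the disclosure-risk reduction evaluated in the paper.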
Generating labels from clicks
The ranking function used by search engines to order results is learned from labeled training data. Each training point is a (query, URL) pair that is labeled by a human judge who assigns a score of Perfect, Excellent, etc., depending on how well the URL matches the query. In this paper, we study whether clicks can be used to automatically generate good labels. Intuitively, documents that are clicked (resp., skipped) in aggregate can indicate relevance (resp., lack of relevance). We give a novel way of transforming clicks into weighted, directed graphs inspired by eye-tracking studies and then devise an objective function for finding cuts in these graphs that induce a good labeling. In its full generality, the problem is NP-hard, but we show that, in the case of two labels, an optimum labeling can be found in linear time. For the more general case, we propose heuristic solutions. Experiments on real click logs show that click-based labels align with the opinion of a panel of judges, especially as the consensus of the panel grows stronger.
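The click-to-graph step can be sketched as follows, assuming a simple "skip-above" preference rule in the spirit of the eye-tracking studies the abstract mentions: a clicked result is taken as preferred over every skipped result ranked above it, and repeated observations accumulate as edge weights. The exact transformation and weighting in the paper may differ; this is an illustrative construction.

```python
from collections import defaultdict

def preference_graph(sessions):
    """Build a weighted, directed preference graph from click logs.

    Each session is (ranked_urls, clicked_set). A click on a result
    implies it was preferred over every skipped result ranked above
    it; each such pair adds weight to a directed edge."""
    edges = defaultdict(int)
    for ranked, clicked in sessions:
        for pos, url in enumerate(ranked):
            if url in clicked:
                for above in ranked[:pos]:
                    if above not in clicked:
                        edges[(url, above)] += 1  # url preferred over `above`
    return dict(edges)

sessions = [
    (["a", "b", "c"], {"c"}),        # c clicked; a and b skipped above it
    (["a", "b", "c"], {"b", "c"}),   # b and c clicked; a skipped
]
g = preference_graph(sessions)
```

A labeling would then be chosen so that few heavy edges point from a lower-labeled document to a higher-labeled one, which is the cut objective the paper optimizes.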
Noiseless database privacy
The notion of differential privacy has recently emerged as a gold standard in the field of database privacy. While this notion has the benefit of providing concrete theoretical privacy guarantees (compared to various previous ad-hoc approaches), its major drawback is that the mechanism needs to inject some noise into the output, limiting its applicability in many settings. In this work, we initiate the study of a new notion of privacy called noiseless privacy. The (very natural) idea we explore is to exploit the entropy already present in the database and use it in place of external noise added to the output. The privacy guarantee we provide is very similar to that of differential privacy, but where that guarantee “comes from” is very different in the two cases. While differential privacy focuses on generality, we make assumptions about the database distribution, the auxiliary information which the adversary may have, and the type of queries. This allows us to obtain “privacy for free” whenever the underlying assumptions are satisfied. In this work, we first formalize the notion of noiseless privacy, introduce two definitions, and show that they are equivalent. We then study certain types of boolean and real queries and show natural (and well-understood) conditions under which noiseless privacy can be obtained with good parameters. We also study the issue of composability and introduce models under which it can be achieved in the noiseless privacy framework.
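The "privacy for free" effect can be checked numerically in the simplest setting, which is our toy calculation rather than the paper's formal conditions: for a sum query over n i.i.d. fair coin flips, flipping one record changes the probability of observing sum k by the likelihood ratio C(n−1, k) / C(n−1, k−1) = (n−k)/k, which stays close to 1 for outputs in the typical range around n/2.

```python
from math import comb, log, sqrt

def sum_query_ratio(n, k):
    """Likelihood ratio Pr[sum=k | x1=0] / Pr[sum=k | x1=1] for a
    database of n i.i.d. fair coin flips: C(n-1,k)/C(n-1,k-1)."""
    return comb(n - 1, k) / comb(n - 1, k - 1)

n = 10_000
# outputs within a few standard deviations of n/2 are the "typical" ones
lo, hi = n // 2 - int(3 * sqrt(n) / 2), n // 2 + int(3 * sqrt(n) / 2)
eps = max(abs(log(sum_query_ratio(n, k))) for k in range(lo, hi + 1))
# eps is small without any injected noise; atypical outputs (e.g. k = 1)
# would be highly revealing, which is why a failure probability over
# atypical outputs enters the definition.
```

The entropy of the database itself does the masking here, which is exactly the substitution for external noise that the noiseless-privacy framework formalizes.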